Add actor testcontainer tests #1192

Open
wants to merge 13 commits into base: master

Conversation

akkie

@akkie akkie commented Jan 16, 2025

Description

This PR adds Testcontainers-based integration tests for actors, to make sure that actors really work with Testcontainers.

Issue reference

This PR was created based on a Discord discussion with @salaboy to check whether actors work with Testcontainers.

@akkie akkie requested review from a team as code owners January 16, 2025 07:03
@akkie
Author

akkie commented Jan 16, 2025

@salaboy I currently have the problem that the test fails with the following exception when trying to connect to the Dapr gRPC port.

Caused by: io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:50001
Caused by: java.net.ConnectException: Connection refused
        at java.base/sun.nio.ch.Net.pollConnect(Native Method)
        at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:682)
        at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:1062)
        at io.grpc.netty.shaded.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:336)
        at io.grpc.netty.shaded.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:339)
        at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
        at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:994)
        at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:1575)

I'm not sure why it uses port 50001, because this is the internal port. I think it should normally use the mapped port. Could you have a look at whether the test is configured incorrectly?
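To narrow down a "Connection refused" like the one above, it can help to check from the test JVM which host port actually accepts connections. The following is a minimal stdlib-only sketch, not code from this PR; the class and method names are hypothetical. With Testcontainers, the host-side port would normally come from `getMappedPort(int)` on the container rather than from the container-internal port; here an ephemeral listener stands in for that mapped port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical diagnostic helper, not part of the Dapr SDK or Testcontainers.
final class PortProbe {

  /** Returns true if a TCP connection to host:port succeeds within the timeout. */
  static boolean canConnect(String host, int port, int timeoutMillis) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return true;
    } catch (IOException e) {
      // Covers java.net.ConnectException ("Connection refused") and timeouts.
      return false;
    }
  }

  /** Opens a listener on an ephemeral port; it stands in for a mapped host port. */
  static ServerSocket listenOnEphemeralPort() {
    try {
      return new ServerSocket(0);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling `PortProbe.canConnect("127.0.0.1", 50001, 1000)` next to the same call with the mapped port would show whether the client is simply dialing the wrong port.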

@salaboy
Contributor

salaboy commented Jan 17, 2025

@akkie thanks a lot for this.. give me some time to look into this.. check the DCO, we need that to approve the PR. Click on the Details link to see the steps to fix it.

@akkie
Author

akkie commented Jan 20, 2025

@salaboy Is it OK if I do a force push regarding the update of the DCO?

@salaboy
Contributor

salaboy commented Jan 20, 2025

@akkie yeah.. that is your fork.. so it is ok

@akkie akkie force-pushed the actor-testcontainer branch from ee83cff to bec4bd2 Compare January 20, 2025 12:54
@salaboy
Contributor

salaboy commented Jan 22, 2025

@akkie would you mind adding me as a collaborator to your fork? I am working on a fix, but I would love to push to your fork.. if not I can send you a patch to apply to your fork with the fix

@akkie
Author

akkie commented Jan 22, 2025

> @akkie would you mind adding me as a collaborator to your fork? I am working on a fix, but I would love to push to your fork.. if not I can send you a patch to apply to your fork with the fix

Done

@salaboy
Contributor

salaboy commented Jan 22, 2025

@akkie I will push two commits to your fork, I am stuck with a new error now.. but at least the connection is working now.

@salaboy
Contributor

salaboy commented Jan 22, 2025

Now I am stuck with this:

time="2025-01-22T17:41:56.55236659Z" level=debug msg="api error: code = Internal desc = error invoke actor method: did not find address for actor TestActor/f5893dd3-9d9c-4405-859e-b0c3b6aa2988" app_id=actor-dapr-app instance=780ae8017e66 scope=dapr.runtime.grpc.api type=log ver=1.14.1


io.dapr.exceptions.DaprException: INTERNAL: error invoke actor method: did not find address for actor TestActor/f5893dd3-9d9c-4405-859e-b0c3b6aa2988

But the connection is working as far as I can tell.

@salaboy
Contributor

salaboy commented Jan 22, 2025

This is strange.. because it looks like we are hitting this: dapr/dapr#6783

https://github.com/dapr/java-sdk/blob/master/examples/src/main/java/io/dapr/examples/actors/DemoActorClient.java

@artursouza do you know if these examples are executed as part of the tests?

@akkie
Author

akkie commented Jan 22, 2025

@salaboy This is exactly the message we get with our testcontainer setup:

fails to send binding event to http app channel, status code: 500 body: Dapr.DaprApiException: error invoke actor method: did not find address for actor

We use .NET and not Java.

@salaboy
Contributor

salaboy commented Jan 22, 2025 via email

@salaboy
Contributor

salaboy commented Jan 23, 2025

@akkie I am curious.. are you testing with an in-memory statestore?

@akkie
Author

akkie commented Jan 23, 2025

@salaboy Yes, we are using Redis.

@salaboy
Contributor

salaboy commented Jan 23, 2025

@akkie if you pull the code that I pushed, can you check that you are getting the same results?

@akkie
Author

akkie commented Jan 23, 2025

@salaboy You mean running the tests? If I run them, yes, I get the same result:

time="2025-01-23T15:27:19.585808881Z" level=debug msg="api error: code = Internal desc = error invoke actor method: did not find address for actor TestActor/7b7f7132-b3ab-4789-97d6-7dd9cfe6401d" app_id=actor-dapr-app instance=987309611b00 scope=dapr.runtime.grpc.api type=log ver=1.14.1

@salaboy
Contributor

salaboy commented Jan 23, 2025

@akkie good news.. I think that I found the issue.
From what I can see there are two different things:

  1. We need to configure the app-port so the sidecar can call back into the application when it needs to execute an actor.
  2. The actor runtime where we register actors (ActorRuntime.getInstance().registerActor(TestActorImpl.class);) is failing to get the right port, because it creates a new gRPC channel without the overrides needed to connect to Testcontainers.

I will try to fix this and push again.

@akkie
Author

akkie commented Jan 30, 2025

@salaboy Any news regarding the issue?

@salaboy
Contributor

salaboy commented Jan 30, 2025 via email

@salaboy
Contributor

salaboy commented Jan 30, 2025 via email

@artur-ciocanu
Contributor

@salaboy and @akkie I have found the culprit. Please check #1202. The TL;DR is that ActorRuntime doesn't allow property overrides, which is why we see the default gRPC port in the logs and issues like actors not being found.

I will try to have a quick fix just to unblock this PR, but in general I think ActorRuntime should be aligned with the WorkflowRuntime design for consistency's sake.
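To illustrate what "allowing property overrides" means here, the following is a minimal sketch of a lookup order in which explicit overrides win over JVM-wide settings, which in turn win over the default port 50001 seen in the stack trace. It is not the SDK's implementation: the class name is hypothetical, and the key `dapr.grpc.port` is used as an assumed, illustrative property name.

```java
import java.util.Map;

// Hypothetical resolver; a sketch of override precedence, not Dapr SDK code.
final class GrpcPortResolver {

  static final int DEFAULT_GRPC_PORT = 50001;           // default port seen in the logs above
  static final String GRPC_PORT_KEY = "dapr.grpc.port"; // assumed property key

  /** Resolves the port: explicit override first, then system property, then default. */
  static int resolve(Map<String, String> overrides) {
    String value = overrides.get(GRPC_PORT_KEY);
    if (value == null) {
      value = System.getProperty(GRPC_PORT_KEY);
    }
    return value == null ? DEFAULT_GRPC_PORT : Integer.parseInt(value.trim());
  }
}
```

A runtime that builds its channel without consulting any overrides behaves like `resolve(Map.of())` with no system property set, so it ends up on 50001 even when the test has a mapped port available.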

CC: @artursouza @cicoyle

artur-ciocanu
artur-ciocanu previously approved these changes Feb 3, 2025
Contributor

@artur-ciocanu artur-ciocanu left a comment

@akkie and @salaboy overall looks great, but there are some weird formatting issues that would be nice to address.

I think checkstyle will complain about it.

@akkie
Author

akkie commented Feb 5, 2025

@artur-ciocanu Thanks for working on that. I fixed the formatting issues. Is there anything I need to do to get the tests running locally? They are still failing for me with the message: api error: code = Internal desc = error invoke actor method: did not find address for actor TestActor/789215ad-9328-4019-8c86-6a4b6cc664a7

@salaboy
Contributor

salaboy commented Feb 5, 2025

@akkie I think we are hitting the same issue as we hit with PubSub.. we need to investigate this further.. My expectation is that if we run this application in a Kubernetes cluster, everything works.. so the issue is related to the sequence in which the app and the sidecar get bootstrapped by Testcontainers.
I've created these examples for other features: #1208; we should add an actor example there and run the applications locally to see if the issue exists outside of the Testcontainers setup.

@salaboy
Contributor

salaboy commented Feb 5, 2025

@akkie also I think you need to take care of the DCO

@salaboy
Contributor

salaboy commented Feb 5, 2025

I've created this PR to make sure that I can trigger the pipelines.. but it should contain exactly the same changes as this one: #1204 check the comment about PubSub there.

@akkie akkie force-pushed the actor-testcontainer branch from 8e0f635 to ef9b2fc Compare February 5, 2025 09:15
@akkie
Author

akkie commented Feb 5, 2025

> @akkie also I think you need to take care of the DCO

Damn. I have fixed it.

@salaboy
Contributor

salaboy commented Feb 7, 2025

@akkie @artur-ciocanu Ok.. I found the issue that was also causing the PubSub IT test to fail, thanks to @artursouza for pointing me to the --enable-app-health-check flag https://github.com/dapr/java-sdk/pull/1192/files#diff-48e3bf69571df5def11599cd5a3f3dfdbdfe292ead052f95a47a3a1939db5845R254

The problem now is that we need to wait before starting the tests, as Dapr needs to wait for the app to be healthy before it starts receiving operations.

There should be a more graceful way to wait. I tried with DaprClient.waitForSidecar(), but I don't think that works as expected either.

If you debug the test, you will see that while the test is sleeping Dapr is finishing the bootstrap, but now at least Dapr pings back the application to check its health before moving forward with the bootstrap.
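One alternative to a fixed sleep is to poll a readiness condition with a deadline, so the test continues as soon as the condition holds and fails fast otherwise. The following is a minimal stdlib-only sketch of that idea; the `Await` class is hypothetical and not part of the Dapr SDK, and what the condition checks (a health probe, a log line, a successful actor call) is left to the caller.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical polling helper; a sketch, not Dapr SDK or Testcontainers code.
final class Await {

  /**
   * Polls the condition until it returns true or the timeout elapses.
   * Returns true if the condition was met, false on timeout or interruption.
   */
  static boolean until(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // deadline reached without the condition holding
      }
      try {
        Thread.sleep(pollInterval.toMillis());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag
        return false;
      }
    }
  }
}
```

A test could call `Await.until(check, Duration.ofSeconds(30), Duration.ofMillis(200))` and assert on the result, which bounds the wait without hard-coding how long the bootstrap takes.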

@salaboy
Contributor

salaboy commented Feb 7, 2025

@akkie I am not sure why this complains about the DCO when I sign my commits.. Can you pull the changes that I pushed to your fork and resign?

akkie and others added 12 commits February 7, 2025 10:47
Signed-off-by: Christian Kaps <[email protected]>
Signed-off-by: Christian Kaps <[email protected]>
Signed-off-by: Christian Kaps <[email protected]>
Signed-off-by: Christian Kaps <[email protected]>
Signed-off-by: Christian Kaps <[email protected]>
Signed-off-by: Christian Kaps <[email protected]>
Co-authored-by: Cassie Coyle <[email protected]>
Signed-off-by: Christian Kaps <[email protected]>
…rs module (dapr#1210)

* feat: Adding basic HTTPEndpoint configuration support in testcontainers module

Signed-off-by: Laurent Broudoux <[email protected]>

* feat: dapr#1209 Adding test for HTTPEndpoint in testcontainers module

Signed-off-by: Laurent Broudoux <[email protected]>

---------

Signed-off-by: Laurent Broudoux <[email protected]>
Signed-off-by: Christian Kaps <[email protected]>
@akkie akkie force-pushed the actor-testcontainer branch from fe76f18 to b27e75b Compare February 7, 2025 09:47
@akkie
Author

akkie commented Feb 7, 2025

> @akkie I am not sure why this complains about the DCO when I sign my commits.. Can you pull the changes that I pushed to your fork and resign?

Done

@salaboy
Contributor

salaboy commented Feb 7, 2025

@akkie if you were trying to do this in dotnet.. I think we found the solution.. do you think that you can give that a try? With @artur-ciocanu we will make sure to merge these tests in the Java SDK

@akkie
Author

akkie commented Feb 7, 2025

Thanks @salaboy and everyone else involved for your help to resolve this issue. Sure, I will try to also fix that in our code. If I understand that correctly, then the only thing I need to do is to call client.WaitForSidecarAsync to wait for the sidecar to be available?

https://docs.dapr.io/developing-applications/sdks/dotnet/dotnet-client/#wait-for-sidecar

@salaboy
Contributor

salaboy commented Feb 8, 2025

@akkie that didn't work for me.. there are a few important things here:

  1. The actor runtime picking up the correct ports to communicate with the sidecar.. that was sorted out
  2. Using the --enable-app-health-check flag and the --app-health-check-path, which enable the sidecar to check that the app is ready before finishing the bootstrap
  3. The tests need to wait for the health check to pass to finish initialization. We might need to add a new waitForSidecar() method, because I think right now that validation is waiting for the full initialization to happen (after the health check validates that the app is up) -> https://github.com/dapr/java-sdk/blob/master/sdk-tests/src/test/java/io/dapr/it/resiliency/WaitForSidecarIT.java#L56 <- maybe we need to extend these tests
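For item 2, the application has to expose an HTTP endpoint that the sidecar can probe. The following is a minimal stdlib-only sketch of such an endpoint, assuming a hypothetical `/healthz` path and a readiness flag that the app flips once its actors are registered. In the PR the application is a Spring Boot app, so this only illustrates the contract (non-2xx until ready, 200 afterwards), not the actual test code.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical health endpoint; the path "/healthz" is an assumption.
final class HealthEndpoint {
  private final HttpServer server;
  private final AtomicBoolean ready = new AtomicBoolean(false);

  /** Starts an HTTP server on the given port (0 picks a free port). */
  HealthEndpoint(int port) {
    try {
      server = HttpServer.create(new InetSocketAddress(port), 0);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    server.createContext("/healthz", exchange -> {
      // 503 until the app marks itself ready, 200 afterwards; no response body.
      exchange.sendResponseHeaders(ready.get() ? 200 : 503, -1);
      exchange.close();
    });
    server.start();
  }

  int port() { return server.getAddress().getPort(); }

  void markReady() { ready.set(true); }

  void stop() { server.stop(0); }

  /** Returns the HTTP status of the health endpoint on localhost. */
  static int probe(int port) {
    try {
      HttpURLConnection conn = (HttpURLConnection)
          URI.create("http://127.0.0.1:" + port + "/healthz").toURL().openConnection();
      conn.setConnectTimeout(1000);
      conn.setReadTimeout(1000);
      try {
        return conn.getResponseCode();
      } finally {
        conn.disconnect();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The design point is that readiness is decided by the application, so the sidecar only finishes its bootstrap after the app has registered what it needs.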

@artur-ciocanu
Contributor

artur-ciocanu commented Feb 8, 2025

@akkie and @salaboy I wanted to share a few observations related to my other PR #1213.

With @artursouza's and @salaboy's help, here is what we know:

  • ActorRuntime needed a way to override ports - this is part of this PR, so we are 🟢
  • DaprContainer needed a way to configure the app health check - this is part of PR #1213 (Add app health check support to Dapr Testcontainer), so we are 🟢
  • The DaprContainer and SpringBootTest bootstrap is a little bit funky; here is my understanding of it:
    1. The JUnit runner starts the Dapr container
    2. The Dapr container waits for the Spring Boot application to start - it should be noted that just because the app has started and the container is running, it doesn't mean that everything (actors, subscriptions, etc.) has been properly registered in Dapr
    3. The JUnit test is run; at this point we could have a race condition, since Dapr could still be registering and doing other initialization.

To avoid the race condition, one option is to use a "dumb" Thread.sleep(...). My proposal is to use a Testcontainers WaitStrategy, something like:

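  // Requires: import org.testcontainers.containers.wait.strategy.Wait;
  @Test
  public void testActors() {
    // Block until the sidecar log shows that the actor runtime has started.
    Wait.forLogMessage(".*Actor runtime started.*", 1)
        .waitUntilReady(DAPR_CONTAINER);

    ....
  }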

This will ensure that the test starts ONLY when the actor runtime is properly started. It is still not the best user experience, and I would like to hide it somewhere, but it is at least in line with Testcontainers and leverages its capabilities. A better option would be to move the Wait call to a @BeforeEach method to ensure we have everything ready when the test starts.
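As a rough illustration of what the two arguments in the wait call above express, a regular expression and a required number of occurrences, the following stdlib-only sketch checks whether a pattern has matched at least N lines of captured log output. It is not how Testcontainers implements its log wait strategy; the `LogGate` class is hypothetical, and the only string taken from this thread is the "Actor runtime started" pattern.

```java
import java.util.regex.Pattern;

// Hypothetical helper; illustrates "regex seen N times", not Testcontainers internals.
final class LogGate {

  /** Returns true once at least `times` lines fully match the regex. */
  static boolean seen(Iterable<String> logLines, String regex, int times) {
    Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
    int hits = 0;
    for (String line : logLines) {
      if (pattern.matcher(line).matches() && ++hits >= times) {
        return true;
      }
    }
    return false;
  }
}
```

Because the match is against the whole line, the leading and trailing `.*` in the pattern are what allow the message to appear anywhere within a log entry.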

5 participants